An index is a store file, a special kind of file used by the Indexing Kit. The store file has an IXStoreDirectory containing an IXFileFinder (named ``FileFinder''). The IXFileFinder performs the actual manipulation of the indexes, and applications that use the Indexing Kit can reach it through the IXStoreDirectory. The Indexing Kit documentation is available online in Digital Librarian.
ixbuild uses several special files when first creating an index. The contents of these files are incorporated into the index itself, so they aren't referenced when an index is updated. However, if the index is deleted and rebuilt from scratch, these files will be used again, so you may not want to delete them. Here are brief descriptions of the files, their uses, and formats:
.index.ftype contains information about the types of files that will be included in the index. A file's type is used to determine how tokens (words) should be extracted from it, or how to convert it to a form that the Indexing Kit can index. Each line in this file should be of the form:
Each field must be separated from the next by exactly one tab. Any field may be ``-'', in which case the field won't be used. typename is the name that should be used for the type; for example, ``man'' or ``ps''. pattern is a sequence of characters within a file that may be used to identify it (for example, ``%!PS''); if pattern begins with a `/', or if the format is regex (see below), it's interpreted as a regular expression. format is the data type of pattern; it may be one of byte, short, long, regex, or string. string is the default format. offset is the unit offset into the file at which pattern is expected to occur. The unit is that of format; that is, if format is long, offset is measured in units of 4 bytes. filename is a filename that should be matched to the type; it may contain wildcards (for example, ``*.rtf''). This might be the ftype entry for PostScript files, for example:
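The field rules described above (tab separation, ``-'' for an unused field, string as the default format, and offsets counted in units of the format) can be sketched in Python. This is only an illustration, not ixbuild's own parser. The field order (typename, pattern, format, offset, filename) is assumed from the order in which the fields are described; the function name parse_ftype_line and the ``demo'' entry are hypothetical; the 2-byte size for short is an assumption (only the 4-byte long is stated); and string and regex offsets are assumed to count single bytes.

```python
# Unit sizes in bytes for each format. Only long = 4 is stated in the text;
# the other sizes are assumptions made for this sketch.
UNIT_SIZE = {"byte": 1, "short": 2, "long": 4, "regex": 1, "string": 1}

# Assumed field order, taken from the order of the field descriptions.
FIELD_NAMES = ("typename", "pattern", "format", "offset", "filename")


def parse_ftype_line(line):
    """Split one tab-separated entry into a dict; '-' marks an unused field."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected 5 tab-separated fields, got %d" % len(fields))
    entry = {name: (None if value == "-" else value)
             for name, value in zip(FIELD_NAMES, fields)}

    fmt = entry["format"] or "string"  # string is the default format
    if fmt not in UNIT_SIZE:
        raise ValueError("unknown format: %r" % fmt)
    entry["format"] = fmt

    # A leading '/' or an explicit regex format marks a regular expression.
    pattern = entry["pattern"]
    entry["is_regex"] = fmt == "regex" or bool(pattern and pattern.startswith("/"))

    # offset is counted in units of the format, so scale it to bytes.
    if entry["offset"] is None:
        entry["byte_offset"] = None
    else:
        entry["byte_offset"] = int(entry["offset"]) * UNIT_SIZE[fmt]
    return entry


# Hypothetical entry: a long-format pattern expected 3 units into the file.
entry = parse_ftype_line("demo\t1234\tlong\t3\t*.demo")
print(entry["byte_offset"])  # 3 units of 4 bytes each
```

Rejecting lines that do not have exactly five fields mirrors the requirement that fields be separated by exactly one tab: a doubled tab produces an extra empty field, which this sketch treats as an error.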
.index.itype contains the names of types of files (as defined in .index.ftype) that will not be included in the index. Each type name should be on a separate line.
.index.iname contains the base names (without paths) of files that will not be included in the index. The filename must be exact; shell wildcards are not allowed. Each file name should be on a separate line.
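The exact-match rule for ignored names can be illustrated with a short Python sketch. It is not ixbuild's implementation: the helper names load_ignored_names and is_ignored and the sample file names are hypothetical, and the sketch assumes that lines are compared as written, with no whitespace trimming.

```python
import os


def load_ignored_names(text):
    """Return the set of base names listed one per line."""
    return {line for line in text.splitlines() if line}


def is_ignored(path, ignored_names):
    """Compare only the base name, exactly; wildcards are not expanded."""
    return os.path.basename(path) in ignored_names


ignored = load_ignored_names("Makefile\nREADME\n*.o\n")
print(is_ignored("/docs/project/README", ignored))  # True: base name matches exactly
print(is_ignored("/docs/project/main.o", ignored))  # False: "*.o" is a literal name here
```

Because the comparison is a plain set lookup on the base name, an entry such as ``*.o'' matches only a file literally named ``*.o''.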
.index.swords contains stop words, which will not be included in the index. Each word should be on a separate line, and should be in post-processed form (that is, if you use case folding, all stop words should be lowercase, and if you use stem reduction, all words should be stems only).
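The reason stop words must be in post-processed form can be seen in a small Python sketch. It assumes case folding is the only post-processing step and leaves stem reduction out entirely; the function names fold and filter_tokens and the sample words are hypothetical, and the sketch does not reproduce how the Indexing Kit itself filters tokens.

```python
def fold(token):
    """Case folding only; stem reduction is deliberately left out of this sketch."""
    return token.lower()


def filter_tokens(tokens, stop_words):
    """Post-process each token, then drop it if it is a stop word."""
    kept = []
    for token in tokens:
        processed = fold(token)
        if processed not in stop_words:
            kept.append(processed)
    return kept


stop_words = {"the", "of"}  # already lowercase, matching the folded tokens
print(filter_tokens(["The", "Index", "of", "Files"], stop_words))
# prints ['index', 'files']
```

Since the comparison happens after folding, a stop list that held ``The'' rather than ``the'' would never match any token.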
.index.domain contains a weighting domain used for peculiarity weighting (see the IXWeightingDomain and IXAttributeParser class specifications in the Indexing Kit documentation). You can use the ixparse(1) command to convert histogram or NEXTSTEP Release 2 WFTable files to domain format.
.index.store     an index file created by ixbuild
.index.ftype     file type table
.index.iname     ignored file names
.index.itype     ignored file types
.index.swords    stop words (dropped from index)
.index.domain    weighting domain